Abstract

In this work we propose a generalization of the Hadamard product
between two matrices to a tensor-valued, multi-linear product between
$k$ matrices for any $k \ge 1$. A multi-linear dual operator to the
generalized Hadamard product is presented. It is a natural generalization
of the $\mathrm{Diag}\,x$ operator, which maps a vector $x \in \mathbb{R}^n$ into the
diagonal matrix with $x$ on its main diagonal. Defining an action of the
$n \times n$ orthogonal matrices on the space of $k$-dimensional tensors, we
investigate its interactions with the generalized Hadamard product
and its dual. The research is motivated, as illustrated throughout
the paper, by the apparent suitability of this language to describe
the higher-order derivatives of spectral functions and the tools needed
to compute them. For more on the latter we refer the reader to [14]
and JB]), where we use the language and properties developed here to
study the higher-order derivatives of spectral functions.
1 Introduction

Spectral functions are functions of a symmetric matrix argument that are
invariant under a closed subgroup of the orthogonal group on the space of
all $n \times n$ symmetric matrices, $S^n$. More precisely, $F : S^n \to \mathbb{R}$ is spectral if

\[ F(U^T X U) = F(X), \]

for all $X \in S^n$ and $U \in O(n)$, the orthogonal group on $\mathbb{R}^n$. It is not
difficult to see that such functions can be represented as the composition

\[ F = f \circ \lambda, \]

where $f : \mathbb{R}^n \to \mathbb{R}$ is a symmetric function ($f(Px) = f(x)$ for any
permutation matrix $P$ and vector $x$), and $\lambda : S^n \to \mathbb{R}^n$ is the eigenvalue map:
$\lambda(X) = (\lambda_1(X), \dots, \lambda_n(X))$, the vector of all eigenvalues of $X$. We will assume
throughout that

\[ \lambda_1(X) \ge \cdots \ge \lambda_n(X). \]

The study of spectral functions generalizes the study of the individual
eigenvalues of a symmetric matrix, since if we let

\[ \phi_k : \mathbb{R}^n \to \mathbb{R}, \qquad \phi_k(x) := \text{the $k$-th largest element of } \{x_1, \dots, x_n\}, \]

then $\phi_k(x)$ is symmetric and

\[ \lambda_k(X) = (\phi_k \circ \lambda)(X). \]

Various smoothness properties of eigenvalues have been studied for some
time now and find many applications in areas ranging from matrix perturbation
theory [TH], and eigenvalue optimization 0, [H], to quantum mechanics
[3]. The Taylor expansion (when it exists) of the eigenvalues of symmetric
matrices depending on one scalar parameter is described in the monograph
by Kato. This naturally raises questions about the differentiability
properties of spectral functions. Many such questions have already been
investigated in the literature (see below) and, surprisingly, the answers to most
of them follow the same pattern: $f \circ \lambda$ has a property at the matrix $X$ if,
and only if, $f$ has the same property at the vector $\lambda(X)$. It is only natural,
then, to try to describe the differentials of $f \circ \lambda$ in terms of the differentials
of the simpler function $f$.
Here is a list of properties for which $f \circ \lambda$ has that property at (or around)
the matrix $X$ if and only if $f$ has the same property at (or around) the vector
$\lambda(X)$.

(i) $F$ is lower semicontinuous at $X$ if, and only if, $f$ is at $\lambda(X)$, [5].

(ii) $F$ is lower semicontinuous and convex if, and only if, $f$ is, j2], [5].

(iii) The symmetric function corresponding to the Fenchel conjugate of $F$
is the Fenchel conjugate of $f$, [T^j, 0. (A similar statement holds for
the recession function of $F$, [15].)

(iv) $F$ is pointed, has good asymptotic behaviour or is a barrier function
on the set $\lambda^{-1}(C)$ if, and only if, $f$ is such on $C$, [13].

(v) $F$ is Lipschitz around $X$ if, and only if, $f$ is such around $\lambda(X)$, [S|

(vi) $F$ is (continuously) differentiable at $X$ if, and only if, $f$ is at $\lambda(X)$, [0].

(vii) $F$ is strictly differentiable at $X$ if, and only if, $f$ is at $\lambda(X)$, [Hj, [Z|.

(viii) $\nabla(f \circ \lambda)$ is semismooth at $X$ if, and only if, $\nabla f$ is at $\lambda(X)$, [12].

(ix) If $f$ is l.s.c. and convex, then $F$ is twice epi-differentiable at $X$ relative
to $\Omega$ if, and only if, $f$ is twice epi-differentiable at $\lambda(X)$ relative to $\lambda(\Omega)$,
[17], where $\Omega$ is an arbitrary epi-gradient.

(x) $F$ has a quadratic expansion at $X$ if, and only if, $f$ has a quadratic
expansion at $\lambda(X)$.

(xi) $F$ is twice (continuously) differentiable at $X$ if, and only if, $f$ is twice
(continuously) differentiable at $\lambda(X)$, [TU].

(xii) $F \in C^\infty$ at $X$ if, and only if, $f \in C^\infty$ at $\lambda(X)$, 0.

(xiii) $F$ is analytic at $X$ if, and only if, $f$ is at $\lambda(X)$, [18].

(xiv) $F$ is a polynomial of the entries of $X$ if, and only if, $f$ is a polynomial.
This is a consequence of the Chevalley Restriction Theorem, ^JJ
p. 143].
We want to stress that there are exceptions to the pattern. For example,
if $f$ is directionally differentiable at $\lambda(X)$ this doesn't imply that $f \circ \lambda$ is
such at $X$, see 0.

In jH] and JU| the authors gave explicit formulae for the gradient and the
Hessian of the spectral function $F$ in terms of the derivatives of the symmetric
function $f$. In order to reproduce them here we need a bit more notation.
For any vector $x$ in $\mathbb{R}^n$, $\mathrm{Diag}\,x$ will denote the diagonal matrix with the vector
$x$ on the main diagonal, and $\mathrm{diag} : M^n \to \mathbb{R}^n$ will denote its dual operator
defined by $\mathrm{diag}(X) = (x_{11}, \dots, x_{nn})$. Recall that the Hadamard product of
two matrices $A = [A^{ij}]$ and $B = [B^{ij}]$ of the same size is the matrix of their
element-wise product $A \circ B = [A^{ij}B^{ij}]$. Thus we have

\[ (1) \qquad \nabla(f \circ \lambda)(X) = V\big(\mathrm{Diag}\,\nabla f(\lambda(X))\big)V^T, \quad \text{and} \]

\[ (2) \qquad \nabla^2(f \circ \lambda)(X)[H_1, H_2] = \nabla^2 f(\lambda(X))[\mathrm{diag}\,\tilde{H}_1, \mathrm{diag}\,\tilde{H}_2] + \langle \mathcal{A}(\lambda(X)), \tilde{H}_1 \circ \tilde{H}_2 \rangle, \]

where $V$ is any orthogonal matrix such that $X = V(\mathrm{Diag}\,\lambda(X))V^T$ is the
ordered spectral decomposition of $X$; $\tilde{H}_i = V^T H_i V$ for $i = 1, 2$; and
$x \in \mathbb{R}^n \to \mathcal{A}(x)$ is a matrix-valued map that is continuous if $\nabla^2 f(x)$ is.

In JTU] a conjecture was made that $F$ is $k$-times (continuously) differentiable
at $X$ if, and only if, $f$ is such at $\lambda(X)$. It is conceivable that
high-powered analytical methods may give a direct proof of this conjecture, but
nevertheless an interesting question is what the $k$-th differential of $F$ looks
like and how to compute it in practice. An explicit formula for the $k$-th differential
of $F$ will generalize the formula for the $k$-th term in the Taylor expansion
(when it exists) of the individual eigenvalues given in [3].

Before attacking the questions in the previous paragraph we need to answer
several more basic questions. What are the common features in Formulae
(1) and (2) that we expect to generalize when we differentiate further?
We propose a language that shows good promise to simplify the description
of the higher-order derivatives of spectral functions. It is based on the idea
of generalizing the Hadamard product of two matrices to a $k$-tensor valued
product between $k$ matrices. The current paper is the first of three. It defines
what we mean by a generalized Hadamard product and investigates some of
its multi-linear algebraic properties. In [T^j we will formulate calculus-type
rules for the interaction between the generalized Hadamard product and the
eigenvalues of symmetric matrices. Finally, in the third paper we will describe how to
compute the derivatives of spectral functions in two important cases. In
particular, we will show that Conjecture 4.1 holds for the derivatives of any
spectral function at a symmetric matrix with distinct eigenvalues, as well as
for the derivatives of separable spectral functions at an arbitrary symmetric
matrix. (Separable spectral functions are those arising from symmetric functions
$f(x) = g(x_1) + \cdots + g(x_n)$ for some function $g$ of a scalar argument.)



2 Generalizations of the Hadamard product 

By $\{H_{pq} : 1 \le p, q \le n\}$ we will denote the standard basis of the space of
all $n \times n$ matrices. That is, the matrices $H_{pq}$ are such that $(H_{pq})^{ij}$ is 1 if
$(i,j) = (p,q)$, and 0 otherwise.

Let us look closely at the Hadamard product, $H_1 \circ H_2$, between two matrices
$H_1$ and $H_2$ from $M^n$. It is a matrix-valued function of two matrix
arguments, linear in each argument separately. Therefore it is uniquely
determined by its values on the pairs of basic matrices $(H_{p_1q_1}, H_{p_2q_2})$.

On such basic pairs the Hadamard product is defined as:

\[ (H_{p_1q_1} \circ H_{p_2q_2})^{ij} = \begin{cases} 1, & \text{if } i = p_1 = p_2 \text{ and } j = q_1 = q_2, \\ 0, & \text{otherwise.} \end{cases} \]

Naturally, we may define the cross Hadamard product by the rule

\[ (H_{p_1q_1} \circ_{(12)} H_{p_2q_2})^{ij} = \begin{cases} 1, & \text{if } i = p_1 = q_2 \text{ and } j = p_2 = q_1, \\ 0, & \text{otherwise,} \end{cases} \]

and then extend this to a bilinear function on all of $M^n \times M^n$. The Hadamard
product and the cross Hadamard product are essentially the same thing:

\[ H_{p_1q_1} \circ_{(12)} H_{p_2q_2} = H_{p_1q_1} \circ H_{p_2q_2}^T = H_{p_1q_1} \circ H_{q_2p_2}. \]

These observations can be naturally generalized in the following way.
Denote the set $\{1, 2, \dots, k\}$ of the first $k$ natural numbers by $\mathbb{N}_k$. A $k$-tensor
on $\mathbb{R}^n$ is a real-valued map on $\mathbb{R}^n \times \cdots \times \mathbb{R}^n$ ($k$ times) linear in each argument
separately. When a basis in $\mathbb{R}^n$ is fixed, a $k$-tensor can be viewed as an
$n \times \cdots \times n$ ($k$ times) "block" of numbers. We will index the elements of a
tensor just like we index the entries of a matrix. The space of all $k$-tensors
on $\mathbb{R}^n$ will be denoted by $T^{k,n}$.
Definition 2.1 For a fixed permutation $\sigma$ on $\mathbb{N}_k$ we define the $\sigma$-Hadamard
product between $k$ matrices to be a $k$-tensor on $\mathbb{R}^n$ as follows. Given any $k$
basic matrices $H_{p_1q_1}, H_{p_2q_2}, \dots, H_{p_kq_k}$ we define:

\[ (H_{p_1q_1} \circ_\sigma H_{p_2q_2} \circ_\sigma \cdots \circ_\sigma H_{p_kq_k})^{i_1 i_2 \dots i_k} = \begin{cases} 1, & \text{if } i_s = p_s = q_{\sigma(s)}, \ \forall s = 1, \dots, k, \\ 0, & \text{otherwise.} \end{cases} \]

Now, extend this product to a $k$-tensor valued map on $k$ matrix arguments,
linear in each of them separately.

Another way to write the above definition is using the Kronecker delta.
Recall that $\delta_{ij}$ is equal to 1 if $i = j$, and 0 otherwise. Thus,

\[ (3) \qquad (H_{p_1q_1} \circ_\sigma H_{p_2q_2} \circ_\sigma \cdots \circ_\sigma H_{p_kq_k})^{i_1 i_2 \dots i_k} = \delta_{i_1p_1}\delta_{i_1q_{\sigma(1)}} \cdots \delta_{i_kp_k}\delta_{i_kq_{\sigma(k)}} = \delta_{i_1p_1}\delta_{p_1q_{\sigma(1)}} \cdots \delta_{i_kp_k}\delta_{p_kq_{\sigma(k)}}. \]

The next lemma gives the formula for the general entry of the $\sigma$-Hadamard
product between arbitrary matrices.

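Lemma 2.2 The $\sigma$-Hadamard product of arbitrary matrices is given by

\[ (H_1 \circ_\sigma H_2 \circ_\sigma \cdots \circ_\sigma H_k)^{i_1 i_2 \dots i_k} = H_1^{i_1 i_{\sigma^{-1}(1)}} \cdots H_k^{i_k i_{\sigma^{-1}(k)}} = H_{\sigma(1)}^{i_{\sigma(1)} i_1} \cdots H_{\sigma(k)}^{i_{\sigma(k)} i_k}. \]

Proof. Let $\sigma$ be a permutation on $\mathbb{N}_k$ and let $H_1, \dots, H_k$ be arbitrary
matrices. Using the definition and the fact that the product is linear in each
argument separately, we compute

\[ \begin{aligned}
(H_1 \circ_\sigma H_2 \circ_\sigma \cdots \circ_\sigma H_k)^{i_1 i_2 \dots i_k}
&= \sum_{p_1,q_1=1}^{n,n} \cdots \sum_{p_k,q_k=1}^{n,n} H_1^{p_1q_1} \cdots H_k^{p_kq_k} \, (H_{p_1q_1} \circ_\sigma \cdots \circ_\sigma H_{p_kq_k})^{i_1 i_2 \dots i_k} \\
&= \sum_{p_1,q_1=1}^{n,n} \cdots \sum_{p_k,q_k=1}^{n,n} H_1^{p_1q_1} \cdots H_k^{p_kq_k} \, \delta_{i_1p_1}\delta_{i_1q_{\sigma(1)}} \cdots \delta_{i_kp_k}\delta_{i_kq_{\sigma(k)}} \\
&= \sum_{p_1,q_1=1}^{n,n} \cdots \sum_{p_k,q_k=1}^{n,n} H_1^{p_1q_1} \cdots H_k^{p_kq_k} \, \delta_{i_1p_1}\delta_{i_{\sigma^{-1}(1)}q_1} \cdots \delta_{i_kp_k}\delta_{i_{\sigma^{-1}(k)}q_k} \\
&= H_1^{i_1 i_{\sigma^{-1}(1)}} \cdots H_k^{i_k i_{\sigma^{-1}(k)}} \\
&= H_{\sigma(1)}^{i_{\sigma(1)} i_1} \cdots H_{\sigma(k)}^{i_{\sigma(k)} i_k}. \qquad \blacksquare
\end{aligned} \]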
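Corollary 2.3 When the first $k - 1$ of the matrices involved in the product
are basic we get

\[ (H_{p_1q_1} \circ_\sigma \cdots \circ_\sigma H_{p_{k-1}q_{k-1}} \circ_\sigma H)^{i_1 i_2 \dots i_k} = \Big( \prod_{t=1,\, t \ne l}^{k} \delta_{i_{\sigma(t)}p_{\sigma(t)}} \delta_{i_t q_{\sigma(t)}} \Big) H^{i_{\sigma(l)} i_l}, \]

where $l = \sigma^{-1}(k)$.

Proof. Let $l = \sigma^{-1}(k)$. Using the result of the previous lemma we calculate

\[ \begin{aligned}
(H_{p_1q_1} \circ_\sigma \cdots \circ_\sigma H_{p_{k-1}q_{k-1}} \circ_\sigma H)^{i_1 i_2 \dots i_k}
&= H_{p_1q_1}^{i_1 i_{\sigma^{-1}(1)}} \cdots H_{p_{k-1}q_{k-1}}^{i_{k-1} i_{\sigma^{-1}(k-1)}} \, H^{i_k i_{\sigma^{-1}(k)}} \\
&= \delta_{i_1p_1}\delta_{i_{\sigma^{-1}(1)}q_1} \cdots \delta_{i_{k-1}p_{k-1}}\delta_{i_{\sigma^{-1}(k-1)}q_{k-1}} \, H^{i_k i_{\sigma^{-1}(k)}} \\
&= \Big( \prod_{t=1,\, t \ne l}^{k} \delta_{i_{\sigma(t)}p_{\sigma(t)}} \delta_{i_t q_{\sigma(t)}} \Big) H^{i_{\sigma(l)} i_l}. \qquad \blacksquare
\end{aligned} \]

The above corollary can be easily modified when the matrix $H$ is in an
arbitrary position in the product.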

Example 2.4 We already saw that, when $k = 2$ and $\sigma = (12)$, the $\sigma$-Hadamard
product is essentially the ordinary Hadamard product:

\[ H_1 \circ_{(12)} H_2 = H_1 \circ H_2^T. \]

If we restrict our attention to the space of symmetric matrices, then the two
products coincide. In the case when $\sigma = (1)(2)$ we get

\[ H_1 \circ_{(1)(2)} H_2 = (\mathrm{diag}\,H_1)(\mathrm{diag}\,H_2)^T. \]
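The entry formula of Lemma 2.2 and the two cases of Example 2.4 can be checked numerically. The following is only a minimal sketch: the function name `sigma_hadamard`, the 0-based encoding of a permutation as a tuple (`sigma[s]` is the image of `s`), and the dictionary representation of a tensor are assumptions made for illustration, not notation from the paper.

```python
from itertools import product


def sigma_hadamard(mats, sigma):
    """Entries of H_1 o_sigma ... o_sigma H_k computed from Lemma 2.2.

    mats  : list of k square matrices (lists of lists), all n x n
    sigma : tuple with sigma[s] = image of s, 0-based (illustrative encoding)
    Returns a dict mapping each index tuple (i_1, ..., i_k) to the entry.
    """
    k, n = len(mats), len(mats[0])
    inv = [0] * k
    for s, image in enumerate(sigma):
        inv[image] = s                                # inv is sigma^{-1}
    tensor = {}
    for idx in product(range(n), repeat=k):
        entry = 1
        for s in range(k):
            entry *= mats[s][idx[s]][idx[inv[s]]]     # H_s^{i_s i_{sigma^{-1}(s)}}
        tensor[idx] = entry
    return tensor


H1 = [[1, 2], [3, 4]]
H2 = [[5, 6], [7, 8]]

cross = sigma_hadamard([H1, H2], (1, 0))      # sigma = (12): H1 o H2^T
diagonal = sigma_hadamard([H1, H2], (0, 1))   # sigma = (1)(2): (diag H1)(diag H2)^T
```

Here `cross[(i, j)]` equals `H1[i][j] * H2[j][i]` and `diagonal[(i, j)]` equals `H1[i][i] * H2[j][j]`, in agreement with the two displayed identities.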

Example 2.5 The $\sigma$-Hadamard product has meaning even when $k = 1$. In
that case, there is just one permutation on the set $\mathbb{N}_1$ and the $\sigma$-Hadamard
product corresponding to it has one matrix argument and returns, by
definition, a vector (1-tensor). Since $\sigma = (1)$, extending the notation, the
$\sigma$-Hadamard product is given by the rule:

\[ (\circ_\sigma H_{p_1q_1})^{i_1} = \begin{cases} 1, & \text{if } i_1 = p_1 = q_1, \\ 0, & \text{otherwise} \end{cases} \; = (\mathrm{diag}\,H_{p_1q_1})^{i_1}. \]

Extending by linearity we get

\[ \circ_\sigma H = \mathrm{diag}\,H. \]
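For any two $k$-tensors, $T_1$ and $T_2$, we define a scalar product between
them in the natural way:

\[ \langle T_1, T_2 \rangle = \sum_{i_1, \dots, i_k = 1}^{n} T_1^{i_1 \dots i_k} T_2^{i_1 \dots i_k}. \]

Lemma 2.6 Let $T$ be a $k$-tensor on $\mathbb{R}^n$, and $H$ be a matrix in $M^n$. Let
$H_{p_1q_1}, \dots, H_{p_{k-1}q_{k-1}}$ be basic matrices in $M^n$, and let $\sigma$ be a permutation on
$\mathbb{N}_k$. Then the following identities hold.

(i) If $\sigma^{-1}(k) = k$, then

\[ \langle T, H_{p_1q_1} \circ_\sigma \cdots \circ_\sigma H_{p_{k-1}q_{k-1}} \circ_\sigma H \rangle = \Big( \prod_{t=1}^{k-1} \delta_{p_t q_{\sigma(t)}} \Big) \sum_{t=1}^{n} T^{p_1 \dots p_{k-1} t} H^{tt}. \]

(ii) If $\sigma^{-1}(k) = l$, where $l \ne k$, then

\[ \langle T, H_{p_1q_1} \circ_\sigma \cdots \circ_\sigma H_{p_{k-1}q_{k-1}} \circ_\sigma H \rangle = \Big( \prod_{t=1,\, t \ne l}^{k-1} \delta_{p_t q_{\sigma(t)}} \Big) T^{p_1 \dots p_{k-1} q_{\sigma(k)}} H^{q_{\sigma(k)} p_{\sigma^{-1}(k)}}. \]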

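Proof. Using the definitions and observation (3), we calculate

\[ \begin{aligned}
\langle T, H_{p_1q_1} \circ_\sigma H_{p_2q_2} \circ_\sigma \cdots \circ_\sigma H_{p_{k-1}q_{k-1}} \circ_\sigma H \rangle
&= \sum_{p_k,q_k=1}^{n,n} H^{p_kq_k} \langle T, H_{p_1q_1} \circ_\sigma \cdots \circ_\sigma H_{p_{k-1}q_{k-1}} \circ_\sigma H_{p_kq_k} \rangle \\
&= \sum_{p_k,q_k=1}^{n,n} H^{p_kq_k} \sum_{i_1,\dots,i_k=1}^{n,\dots,n} T^{i_1 \dots i_k} (H_{p_1q_1} \circ_\sigma \cdots \circ_\sigma H_{p_kq_k})^{i_1 \dots i_k} \\
&= \sum_{p_k,q_k=1}^{n,n} H^{p_kq_k} \sum_{i_1,\dots,i_k=1}^{n,\dots,n} T^{i_1 \dots i_k} \delta_{i_1p_1}\delta_{p_1q_{\sigma(1)}} \cdots \delta_{i_kp_k}\delta_{p_kq_{\sigma(k)}} \\
&= \sum_{p_k,q_k=1}^{n,n} H^{p_kq_k} T^{p_1 \dots p_k} \delta_{p_1q_{\sigma(1)}} \cdots \delta_{p_kq_{\sigma(k)}}.
\end{aligned} \]

The result follows easily by considering the two cases separately. $\blacksquare$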
3 A partial order on $P^k$ and one property of
the $\sigma$-Hadamard product

Given two permutations $\sigma$, $\mu$ on $\mathbb{N}_k$, we say that $\sigma$ refines $\mu$ if for every
$s \in \mathbb{N}_k$ there is an $r \in \mathbb{N}_k$ such that

\[ \{\sigma^l(s) : l = 1, 2, \dots\} \subseteq \{\mu^l(r) : l = 1, 2, \dots\}, \]

where $\sigma^l(s) = \sigma(\sigma(\cdots(\sigma(s))\cdots))$, $l$ times.

In other words, $\sigma$ refines $\mu$ if every cycle of $\sigma$ is contained in a cycle of $\mu$.
Clearly the cycles of $\sigma$ will partition the cycles of $\mu$. If $\sigma$ refines $\mu$ we will
denote it by

\[ \mu \preceq \sigma. \]

The set of all permutations on $\mathbb{N}_k$, as well as the set of all $k \times k$ permutation
matrices, will be denoted by $P^k$. Clearly the refinement is a pre-order on
$P^k$ (it is reflexive and transitive, but not antisymmetric). With respect to this
pre-order, the identity permutation is the biggest element (that is, bigger
than any other element) and every permutation with only one cycle is a smallest
element (that is, it is smaller than any other element).

There is a natural map between the set $P^k$ and the diagonal subspaces of
$\mathbb{R}^k$, given as follows:

\[ \mathcal{D}(\sigma) = \{x \in \mathbb{R}^k : x_s = x_{\sigma(s)} \ \forall s \in \mathbb{N}_k\}. \]

This map is onto but is not one-to-one since, for example, when $k = 3$,
$\mathcal{D}((123)) = \mathcal{D}((132)) = \{x \in \mathbb{R}^3 : x_1 = x_2 = x_3\}$. Clearly the image of the
identity permutation is $\mathbb{R}^k$. The following relationship helps to visualize the
pre-order on $P^k$:

\[ \mu \preceq \sigma \iff \mathcal{D}(\mu) \subseteq \mathcal{D}(\sigma). \]
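The refinement relation and its description through the diagonal subspaces can be illustrated with a short sketch. The helper names `cycles`, `refines` and `d_contained`, and the 0-based tuple encoding of permutations, are assumptions introduced here for illustration. The inclusion $\mathcal{D}(\mu) \subseteq \mathcal{D}(\sigma)$ is tested on a generic vector of $\mathcal{D}(\mu)$, one that takes distinct values on distinct cycles of $\mu$.

```python
from itertools import permutations


def cycles(perm):
    """Cycles of a permutation given as a tuple, perm[s] = image of s (0-based)."""
    seen, result = set(), []
    for start in range(len(perm)):
        if start in seen:
            continue
        cycle, s = set(), start
        while s not in cycle:
            cycle.add(s)
            s = perm[s]
        seen |= cycle
        result.append(frozenset(cycle))
    return result


def refines(sigma, mu):
    """True when every cycle of sigma lies inside a cycle of mu (mu <= sigma)."""
    mu_cycles = cycles(mu)
    return all(any(c <= m for m in mu_cycles) for c in cycles(sigma))


def d_contained(mu, sigma):
    """True when D(mu) is a subset of D(sigma), tested on a generic vector of D(mu)."""
    label = {}
    for number, cycle in enumerate(cycles(mu)):
        for s in cycle:
            label[s] = number          # distinct value on each cycle of mu
    return all(label[s] == label[sigma[s]] for s in range(len(sigma)))


identity, c123, c132, t12 = (0, 1, 2), (1, 2, 0), (2, 0, 1), (1, 0, 2)

# The two descriptions of the pre-order agree on all pairs of permutations of N_3.
agree = all(refines(s, m) == d_contained(m, s)
            for s in permutations(range(3)) for m in permutations(range(3)))
```

In this encoding `refines(c123, c132)` and `refines(c132, c123)` both hold although the two permutations differ, which is the failure of antisymmetry noted above.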

Finally, given a tensor $T \in T^{k,n}$ we may want to preserve the entries
lying on a diagonal "subspace" of $T$ and substitute the rest of the entries of
$T$ with zeros. In other words, given a permutation $\mu \in P^k$, we introduce the
notation $P_\mu(T)$ for the tensor in $T^{k,n}$ defined by

\[ (P_\mu(T))^{i_1 \dots i_k} = \begin{cases} T^{i_1 \dots i_k}, & \text{if } i_s = i_{\mu(s)}, \ \forall s \in \mathbb{N}_k, \\ 0, & \text{otherwise.} \end{cases} \]

After all these preparations, we can formulate the main result in this
section. It describes when we can transfer diagonal "subspaces" of $T$ between
different $\sigma$-Hadamard products.
Theorem 3.1 Let $\sigma_1$, $\sigma_2$, and $\mu$ be three permutations on $\mathbb{N}_k$. Then the
identity

\[ \langle P_\mu(T), H_1 \circ_{\sigma_1} \cdots \circ_{\sigma_1} H_k \rangle = \langle P_\mu(T), H_1 \circ_{\sigma_2} \cdots \circ_{\sigma_2} H_k \rangle \]

holds for any matrices $H_1, \dots, H_k$, and any tensor $T$ in $T^{k,n}$ if, and only if,
$\mu \preceq \sigma_2^{-1} \circ \sigma_1$.

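Proof. Since both sides are linear in each of the matrices $H_1, \dots, H_k$ separately,
it is enough to prove the theorem when these matrices are basic. In
other words, we are going to show that

\[ \langle P_\mu(T), H_{p_1q_1} \circ_{\sigma_1} \cdots \circ_{\sigma_1} H_{p_kq_k} \rangle = \langle P_\mu(T), H_{p_1q_1} \circ_{\sigma_2} \cdots \circ_{\sigma_2} H_{p_kq_k} \rangle, \]

for any indexes $p_1, \dots, p_k$, $q_1, \dots, q_k$, and for any $T \in T^{k,n}$ if, and only if,
$\mu \preceq \sigma_2^{-1} \circ \sigma_1$. Direct calculation shows:

\[ \begin{aligned}
\langle P_\mu(T), H_{p_1q_1} \circ_{\sigma_1} \cdots \circ_{\sigma_1} H_{p_kq_k} \rangle
&= \sum_{i_1,\dots,i_k=1}^{n,\dots,n} (P_\mu(T))^{i_1 \dots i_k} (H_{p_1q_1} \circ_{\sigma_1} \cdots \circ_{\sigma_1} H_{p_kq_k})^{i_1 \dots i_k} \\
&= \sum_{i_1,\dots,i_k=1}^{n,\dots,n} (P_\mu(T))^{i_1 \dots i_k} H_{p_1q_1}^{i_1 i_{\sigma_1^{-1}(1)}} \cdots H_{p_kq_k}^{i_k i_{\sigma_1^{-1}(k)}} \\
&= \sum_{i_1,\dots,i_k=1}^{n,\dots,n} (P_\mu(T))^{i_1 \dots i_k} \delta_{i_1p_1}\delta_{i_1q_{\sigma_1(1)}} \cdots \delta_{i_kp_k}\delta_{i_kq_{\sigma_1(k)}} \\
&= (P_\mu(T))^{p_1 \dots p_k} \delta_{p_1q_{\sigma_1(1)}} \cdots \delta_{p_kq_{\sigma_1(k)}}.
\end{aligned} \]

The last expression is equal to $T^{p_1 \dots p_k}$ when $p_s = p_{\mu(s)} = q_{\sigma_1(s)}$ for all $s \in \mathbb{N}_k$,
and is equal to 0 otherwise.
Analogously we have

\[ \langle P_\mu(T), H_{p_1q_1} \circ_{\sigma_2} \cdots \circ_{\sigma_2} H_{p_kq_k} \rangle = (P_\mu(T))^{p_1 \dots p_k} \delta_{p_1q_{\sigma_2(1)}} \cdots \delta_{p_kq_{\sigma_2(k)}}, \]

which is equal to $T^{p_1 \dots p_k}$ when $p_s = p_{\mu(s)} = q_{\sigma_2(s)}$ for all $s \in \mathbb{N}_k$, and is equal
to 0 otherwise.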

Suppose that $\mu \preceq \sigma_2^{-1} \circ \sigma_1$. We consider three cases.

If there is an $s_0$ such that $p_{s_0} \ne p_{\mu(s_0)}$, then both expressions are zero and
we trivially have equality.

If $p_s = p_{\mu(s)}$ for all $s \in \mathbb{N}_k$ but for some $s_0$ we have that $p_{s_0} \ne q_{\sigma_1(s_0)}$, then
it is not possible to have $p_s = q_{\sigma_2(s)}$ for all $s \in \mathbb{N}_k$. Indeed, suppose on the
contrary that $p_s = q_{\sigma_2(s)}$ for all $s \in \mathbb{N}_k$. Letting $r = \sigma_2(s)$ we get $p_{\sigma_2^{-1}(r)} = q_r$
for every $r \in \mathbb{N}_k$. Therefore $p_{\sigma_2^{-1}(\sigma_1(s))} = q_{\sigma_1(s)}$ for every $s \in \mathbb{N}_k$. In particular
$p_{\sigma_2^{-1}(\sigma_1(s_0))} = q_{\sigma_1(s_0)} \ne p_{s_0}$. But $\mu \preceq \sigma_2^{-1} \circ \sigma_1$ implies that $\sigma_2^{-1}(\sigma_1(s_0))$ and $s_0$
belong to the same cycle of $\mu$, that is, $\mu^l(s_0) = \sigma_2^{-1}(\sigma_1(s_0))$ for some $l \in \mathbb{N}$.
By the assumption in this case we have that $p_{s_0} = p_{\mu^l(s_0)}$ for every $l$. This is
a contradiction. Thus for some $s_1 \in \mathbb{N}_k$ we have $p_{s_1} \ne q_{\sigma_2(s_1)}$ and again we
will have that both expressions are equal to zero.

Suppose finally that $p_s = p_{\mu(s)} = q_{\sigma_1(s)}$ for all $s \in \mathbb{N}_k$. Then the first
expression is equal to $T^{p_1 \dots p_k}$. If we show that $p_s = q_{\sigma_2(s)}$ for every $s \in \mathbb{N}_k$,
then we will be done. Suppose this is not true, that is, for some $s_0$, $p_{s_0} \ne q_{\sigma_2(s_0)}$.
Then for $r_0 = \sigma_2(s_0)$ we will have $p_{\sigma_2^{-1}(r_0)} \ne q_{r_0}$, and for $s_1 = \sigma_1^{-1}(r_0)$
we have $p_{\sigma_2^{-1}(\sigma_1(s_1))} \ne q_{\sigma_1(s_1)}$. Again $\mu \preceq \sigma_2^{-1} \circ \sigma_1$ implies that $\sigma_2^{-1}(\sigma_1(s_1))$
and $s_1$ belong to the same cycle of $\mu$ and we reach a contradiction as in the
previous case.

To prove the opposite direction of the theorem, suppose that

\[ (4) \qquad (P_\mu(T))^{p_1 \dots p_k} \delta_{p_1q_{\sigma_1(1)}} \cdots \delta_{p_kq_{\sigma_1(k)}} = (P_\mu(T))^{p_1 \dots p_k} \delta_{p_1q_{\sigma_2(1)}} \cdots \delta_{p_kq_{\sigma_2(k)}} \]

for every choice of the indexes $p_1, \dots, p_k$ and $q_1, \dots, q_k$ and every $T$. Take $T$ to
be such that $T^{i_1 \dots i_k} \ne 0$ for every choice of the indexes $i_1, \dots, i_k$ satisfying
$i_s = i_{\mu(s)}$ for every $s \in \mathbb{N}_k$. Suppose that $\mu \not\preceq \sigma_2^{-1} \circ \sigma_1$. This means that
there is a number $s_0 \in \mathbb{N}_k$ such that $\sigma_2^{-1}(\sigma_1(s_0))$ and $s_0$ are not in the same
cycle of $\mu$. Choose the indexes $p_1, \dots, p_k$ and $q_1, \dots, q_k$ so that $p_s = p_{\mu(s)}$ and
$p_s = q_{\sigma_1(s)}$, for every $s \in \mathbb{N}_k$. Moreover, choose the indexes $p_1, \dots, p_k$ so that
if $s, r \in \mathbb{N}_k$ are not in the same cycle of $\mu$, then $p_s \ne p_r$. This in particular
means that $p_{\sigma_2^{-1}(\sigma_1(s_0))} \ne p_{s_0}$.

With the choices so made, the left-hand side of Equation (4) will be equal
to $T^{p_1 \dots p_k} \ne 0$. We will reach a contradiction if we show that for some $r_0$,
$p_{r_0} \ne q_{\sigma_2(r_0)}$, since then the right-hand side of Equation (4) will be zero.
Suppose on the contrary that $p_r = q_{\sigma_2(r)}$ for every $r \in \mathbb{N}_k$. Then,

\[ p_{\sigma_2^{-1}(\sigma_1(s))} = q_{\sigma_1(s)} = p_s \quad \text{for every } s \in \mathbb{N}_k. \]

Substitute above $s = s_0$ to reach a contradiction. Thus, $p_{r_0} \ne q_{\sigma_2(r_0)}$ for some
$r_0 \in \mathbb{N}_k$ and we are done. $\blacksquare$
Notice that if $\mu \preceq \nu$, then for an arbitrary permutation $\sigma$ in $P^k$ we have

\[ \mu \preceq (\sigma \circ \nu)^{-1} \circ \sigma, \]

since $(\sigma \circ \nu)^{-1} \circ \sigma = \nu^{-1}$ has the same cycles as $\nu$.
This observation leads to the next corollary.

Corollary 3.2 Suppose $\mu$ and $\nu$ are permutations in $P^k$ such that $\mu \preceq \nu$.
Then for an arbitrary permutation $\sigma \in P^k$, any matrices $H_1, \dots, H_k$, and a tensor
$T$ in $T^{k,n}$ we have the identity:

\[ \langle P_\mu(T), H_1 \circ_\sigma \cdots \circ_\sigma H_k \rangle = \langle P_\mu(T), H_1 \circ_{\sigma \circ \nu} \cdots \circ_{\sigma \circ \nu} H_k \rangle. \]

In particular, the result holds when $\nu = \mu$ or $\nu = \mu^{-1}$.

It will be useful to see what the conclusions of the above theorem are
when $k \le 3$. We summarize them in the next corollary.

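Corollary 3.3 For any $T \in T^{2,n}$ and any two matrices $H_1$ and $H_2$ we have

\[ \langle P_{(12)}(T), H_1 \circ_{(1)(2)} H_2 \rangle = \langle P_{(12)}(T), H_1 \circ_{(12)} H_2 \rangle. \]

For any $T \in T^{3,n}$ and any three matrices $H_1$, $H_2$, and $H_3$ we have

\[ \langle P_{(13)}(T), H_1 \circ_{(132)} H_2 \circ_{(132)} H_3 \rangle = \langle P_{(13)}(T), H_1 \circ_{(12)(3)} H_2 \circ_{(12)(3)} H_3 \rangle, \]

\[ \langle P_{(23)}(T), H_1 \circ_{(123)} H_2 \circ_{(123)} H_3 \rangle = \langle P_{(23)}(T), H_1 \circ_{(12)(3)} H_2 \circ_{(12)(3)} H_3 \rangle, \]

and

\[ \langle P_{(13)}(T), H_1 \circ_{(13)(2)} H_2 \circ_{(13)(2)} H_3 \rangle = \langle P_{(13)}(T), H_1 \circ_{(1)(2)(3)} H_2 \circ_{(1)(2)(3)} H_3 \rangle, \]

\[ \langle P_{(23)}(T), H_1 \circ_{(1)(23)} H_2 \circ_{(1)(23)} H_3 \rangle = \langle P_{(23)}(T), H_1 \circ_{(1)(2)(3)} H_2 \circ_{(1)(2)(3)} H_3 \rangle. \]

Finally, for any two permutations $\sigma_1$, $\sigma_2$ on $\mathbb{N}_3$ we have

\[ \langle P_{(123)}(T), H_1 \circ_{\sigma_1} H_2 \circ_{\sigma_1} H_3 \rangle = \langle P_{(123)}(T), H_1 \circ_{\sigma_2} H_2 \circ_{\sigma_2} H_3 \rangle. \]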
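Theorem 3.1 can be explored numerically for $k = 3$. The sketch below is an illustration under stated assumptions, not part of the paper: permutations are 0-based tuples, tensors are dictionaries, and the helper names (`inverse`, `cycle_labels`, `criterion`, `pairing`, `identity_holds`) are hypothetical. The pairing uses the entry formula of Lemma 2.2. Exact integer arithmetic settles the "if" direction on the sampled data; for the "only if" direction a few random integer samples give only strong evidence, since an accidental equality is possible in principle.

```python
import random
from itertools import permutations, product


def inverse(sigma):
    inv = [0] * len(sigma)
    for s, image in enumerate(sigma):
        inv[image] = s
    return tuple(inv)


def cycle_labels(mu):
    """Label each s by the smallest element of its cycle of mu."""
    labels = []
    for s in range(len(mu)):
        orbit, t = {s}, mu[s]
        while t != s:
            orbit.add(t)
            t = mu[t]
        labels.append(min(orbit))
    return labels


def criterion(mu, sigma1, sigma2):
    """mu <= sigma2^{-1} o sigma1: s and sigma2^{-1}(sigma1(s)) share a cycle of mu."""
    labels, inv2 = cycle_labels(mu), inverse(sigma2)
    return all(labels[inv2[sigma1[s]]] == labels[s] for s in range(len(mu)))


def pairing(T, mats, sigma, mu):
    """<P_mu(T), H_1 o_sigma ... o_sigma H_k>, with the product taken from Lemma 2.2."""
    k, n, inv = len(mats), len(mats[0]), inverse(sigma)
    total = 0
    for i in product(range(n), repeat=k):
        if any(i[s] != i[mu[s]] for s in range(k)):
            continue                      # this entry is zeroed by P_mu
        term = T[i]
        for s in range(k):
            term *= mats[s][i[s]][i[inv[s]]]
        total += term
    return total


def identity_holds(mu, sigma1, sigma2, n=3, trials=3, seed=0):
    """Compare both sides of the identity on a few random integer samples."""
    rng = random.Random(seed)
    k = len(mu)
    for _ in range(trials):
        T = {i: rng.randint(-1000, 1000) for i in product(range(n), repeat=k)}
        mats = [[[rng.randint(-1000, 1000) for _ in range(n)] for _ in range(n)]
                for _ in range(k)]
        if pairing(T, mats, sigma1, mu) != pairing(T, mats, sigma2, mu):
            return False
    return True


perms = list(permutations(range(3)))
mismatches = [(mu, s1, s2) for mu in perms for s1 in perms for s2 in perms
              if identity_holds(mu, s1, s2) != criterion(mu, s1, s2)]
```

An empty `mismatches` list means that, on the sampled data, the identity held exactly for those triples $(\mu, \sigma_1, \sigma_2)$ that satisfy $\mu \preceq \sigma_2^{-1} \circ \sigma_1$. For instance, the first three-matrix identity of Corollary 3.3 corresponds to `criterion((2, 1, 0), (2, 0, 1), (1, 0, 2))`.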

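Example 3.4 In this example we demonstrate that Formula (1) for the first
derivative of a spectral function, at $X$, can be rewritten in a different form.
Let $X = V(\mathrm{Diag}\,\lambda(X))V^T$ and $\tilde{E} = V^T E V$, where $E$ is a symmetric matrix.
Using the definitions and notation in the previous section we have:

\[ \begin{aligned}
\nabla(f \circ \lambda)(X)[E] &= \langle V(\mathrm{Diag}\,\nabla f(\lambda(X)))V^T, E \rangle \\
&= \langle \nabla f(\lambda(X)), \mathrm{diag}\,\tilde{E} \rangle \\
&= \langle \nabla f(\lambda(X)), \circ_{(1)} \tilde{E} \rangle.
\end{aligned} \]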
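Example 3.5 Let $X$ be a symmetric matrix with ordered spectral decomposition
$X = V(\mathrm{Diag}\,\lambda(X))V^T$. Take two symmetric matrices $E_1$ and $E_2$
and let $\tilde{E}_i = V^T E_i V$ for $i = 1, 2$. As we saw in the examples in Section 2 we
have:

\[ \tilde{E}_1 \circ_{(1)(2)} \tilde{E}_2 = (\mathrm{diag}\,\tilde{E}_1)(\mathrm{diag}\,\tilde{E}_2)^T \quad \text{and} \quad \tilde{E}_1 \circ_{(12)} \tilde{E}_2 = \tilde{E}_1 \circ \tilde{E}_2. \]

Then Formula (2) for the Hessian of the spectral function $f \circ \lambda$ becomes:

\[ \begin{aligned}
\nabla^2(f \circ \lambda)(X)[E_1, E_2] &= \nabla^2 f(\lambda(X))[\mathrm{diag}\,\tilde{E}_1, \mathrm{diag}\,\tilde{E}_2] + \langle \mathcal{A}(\lambda(X)), \tilde{E}_1 \circ \tilde{E}_2 \rangle \\
&= \langle \nabla^2 f(\lambda(X)), \tilde{E}_1 \circ_{(1)(2)} \tilde{E}_2 \rangle + \langle \mathcal{A}(\lambda(X)), \tilde{E}_1 \circ_{(12)} \tilde{E}_2 \rangle.
\end{aligned} \]

All these examples support the following conjecture, which describes the
structure of the higher-order derivatives of spectral functions.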

Conjecture 3.1 The spectral function $f \circ \lambda$ is $k$ times (continuously) differentiable
at $X$ if and only if $f(x)$ is $k$ times (continuously) differentiable at
the vector $\lambda(X)$. Moreover, there are $k$-tensor valued maps $\mathcal{A}_\sigma : \mathbb{R}^n \to T^{k,n}$,
$\sigma \in P^k$, such that for any symmetric matrices $E_1, \dots, E_k$ we have

\[ \nabla^k(f \circ \lambda)(X)[E_1, \dots, E_k] = \sum_{\sigma \in P^k} \langle \mathcal{A}_\sigma(\lambda(X)), \tilde{E}_1 \circ_\sigma \cdots \circ_\sigma \tilde{E}_k \rangle, \]

where $X = V(\mathrm{Diag}\,\lambda(X))V^T$ and $\tilde{E}_i = V^T E_i V$, for $i = 1, \dots, k$.

In ^3] we will show that this conjecture holds for the derivatives of any
spectral function at a symmetric matrix $X$ with distinct eigenvalues, as well
as for the derivatives of separable spectral functions at an arbitrary symmetric
matrix. (Separable spectral functions are those arising from symmetric
functions $f(x) = g(x_1) + \cdots + g(x_n)$ for some function $g$ of a scalar argument.)
There we also describe how to compute the operators $\mathcal{A}_\sigma$ for every $\sigma$
in $P^k$.

There is one major drawback of the conjectured formula above. On the
left-hand side we have the $k$-th derivative of the spectral function evaluated
at the matrices $E_1, \dots, E_k$, while on the right-hand side these matrices
are "jumbled" with the orthogonal matrix $V$ into the $\sigma$-Hadamard products
$\tilde{E}_1 \circ_\sigma \cdots \circ_\sigma \tilde{E}_k$. This is the problem that we address in the next section.
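4 The $\mathrm{Diag}^\sigma$ operator

Recall that the adjoint of the linear operator $\mathrm{Diag} : \mathbb{R}^n \to M^n$ is the operator
$\mathrm{diag} : M^n \to \mathbb{R}^n$. That is, we have the identity

\[ (5) \qquad \langle \mathrm{Diag}\,x, H \rangle = \langle x, \mathrm{diag}\,H \rangle, \]

for any vector $x$ and any matrix $H$. It is also easy to verify that for any
vector $x$, matrix $H$, and orthogonal matrix $U$ we have

\[ (6) \qquad \langle U(\mathrm{Diag}\,x)U^T, H \rangle = \langle x, \mathrm{diag}(U^T H U) \rangle. \]

A vector $x$ can be viewed as a 1-tensor on $\mathbb{R}^n$ given through the linear isometry
$x \to \langle x, \cdot \rangle$ and similarly $\mathrm{Diag}\,x$ can be viewed as a 2-tensor. In this section
we will generalize Equations (5) and (6) for an arbitrary $k$-tensor in place of
$x$ and an arbitrary $\sigma$-Hadamard product in place of $\mathrm{diag}$.

Let $T$ be an arbitrary $k$-tensor on $\mathbb{R}^n$ and let $\sigma$ be a permutation on $\mathbb{N}_k$.
We define $\mathrm{Diag}^\sigma T$ to be a $2k$-tensor on $\mathbb{R}^n$ in the following way:

\[ (\mathrm{Diag}^\sigma T)^{i_1 \dots i_k}_{j_1 \dots j_k} = \begin{cases} T^{i_1 \dots i_k}, & \text{if } i_s = j_{\sigma(s)}, \ \forall s = 1, \dots, k, \\ 0, & \text{otherwise.} \end{cases} \]

When $k = 1$ and $\sigma$ is the only choice from $P^1$, namely $\sigma = (1)$, this
definition coincides with the definition of the $\mathrm{Diag}$ operator in Equation (5).
An equivalent way to define $\mathrm{Diag}^\sigma T$ that is useful for calculations is:

\[ (\mathrm{Diag}^\sigma T)^{i_1 \dots i_k}_{j_1 \dots j_k} = T^{i_1 \dots i_k} \delta_{i_1 j_{\sigma(1)}} \cdots \delta_{i_k j_{\sigma(k)}}. \]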

We now define an action of the group, $O^n$, of all $n \times n$ orthogonal matrices
on the space of all $k$-tensors on $\mathbb{R}^n$. For any $k$-tensor $T$, and $U \in O^n$ this
action will be denoted by $UTU^T$, and defined by:

\[ (7) \qquad (UTU^T)^{i_1 \dots i_k} = \sum_{p_1=1}^{n} \cdots \sum_{p_k=1}^{n} T^{p_1 \dots p_k} U^{i_1p_1} \cdots U^{i_kp_k}. \]

In the case $k = 1$, when $T$ is viewed as an $n$-dimensional vector, this is
exactly the action of the orthogonal group on $\mathbb{R}^n$:

\[ (UTU^T)^{i_1} = (UT)^{i_1} = \sum_{p_1=1}^{n} U^{i_1p_1} T^{p_1}. \]

In the case $k = 2$ the definition coincides with the conjugate action of the
orthogonal group on the set of all $n \times n$ square matrices:

\[ (UTU^T)^{ij} = \sum_{p,q=1}^{n,n} T^{pq} U^{ip} U^{jq}, \]

hence the use of the same notation for the general action $UTU^T$. For future
reference we state the formula of the action in the case when the tensor is of
even order. That is, if $T$ is a $2k$-tensor, then

\[ (8) \qquad (UTU^T)^{i_1 \dots i_k}_{j_1 \dots j_k} = \sum_{p_1,q_1=1}^{n,n} \cdots \sum_{p_k,q_k=1}^{n,n} \Big( T^{p_1 \dots p_k}_{q_1 \dots q_k} \prod_{\nu=1}^{k} U^{i_\nu p_\nu} U^{j_\nu q_\nu} \Big). \]

Let $P$ be an $n \times n$ permutation matrix and $\sigma$ its corresponding permutation
on $\mathbb{N}_n$, that is, $P^T e^i = e^{\sigma(i)}$ for all $i = 1, \dots, n$, where $\{e^i \mid i = 1, \dots, n\}$ is
the standard basis in $\mathbb{R}^n$. The action of $P$ on the tensors will be given by:

\[ (PTP^T)^{i_1 \dots i_k} = \sum_{p_1=1}^{n} \cdots \sum_{p_k=1}^{n} T^{p_1 \dots p_k} \prod_{\nu=1}^{k} P^{i_\nu p_\nu} = T^{\sigma(i_1) \dots \sigma(i_k)}. \]

That is, the conjugate action of a permutation matrix on a $k$-tensor is what
one expects it to be. We have the following immediate observation.

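Lemma 4.1 For any permutation $\mu$ on $\mathbb{N}_k$, any permutation matrix $P$ in
$P^n$ and any $k$-tensor $T$ on $\mathbb{R}^n$, we have

\[ P(\mathrm{Diag}^\mu T)P^T = \mathrm{Diag}^\mu(PTP^T). \]

Proof. Let $\sigma$ be the permutation on $\mathbb{N}_n$ corresponding to $P$. Fix any multi
index $(i_1, \dots, i_k; j_1, \dots, j_k)$. We begin calculating the left-hand side entry corresponding
to that index. In the third equality below, we use the fact that $\sigma$ is a
one-to-one map.

\[ \begin{aligned}
(P(\mathrm{Diag}^\mu T)P^T)^{i_1 \dots i_k}_{j_1 \dots j_k}
&= (\mathrm{Diag}^\mu T)^{\sigma(i_1) \dots \sigma(i_k)}_{\sigma(j_1) \dots \sigma(j_k)} \\
&= T^{\sigma(i_1) \dots \sigma(i_k)} \delta_{\sigma(i_1)\sigma(j_{\mu(1)})} \cdots \delta_{\sigma(i_k)\sigma(j_{\mu(k)})} \\
&= T^{\sigma(i_1) \dots \sigma(i_k)} \delta_{i_1 j_{\mu(1)}} \cdots \delta_{i_k j_{\mu(k)}} \\
&= (PTP^T)^{i_1 \dots i_k} \delta_{i_1 j_{\mu(1)}} \cdots \delta_{i_k j_{\mu(k)}} \\
&= (\mathrm{Diag}^\mu(PTP^T))^{i_1 \dots i_k}_{j_1 \dots j_k}. \qquad \blacksquare
\end{aligned} \]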

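A natural question to ask is whether the action defined above on the
space $T^{k,n}$ is associative.

Lemma 4.2 For any $k$-tensor, $T$, on $\mathbb{R}^n$ and any two orthogonal matrices
$U$, $V$ in $O^n$ we have

\[ V(UTU^T)V^T = (VU)T(VU)^T. \]

Proof. The proof is a direct calculation using the definitions. On one hand
we have

\[ \begin{aligned}
(V(UTU^T)V^T)^{i_1 \dots i_k}
&= \sum_{p_1=1}^{n} \cdots \sum_{p_k=1}^{n} (UTU^T)^{p_1 \dots p_k} \prod_{\nu=1}^{k} V^{i_\nu p_\nu} \\
&= \sum_{p_1=1}^{n} \cdots \sum_{p_k=1}^{n} \Big( \Big( \sum_{l_1=1}^{n} \cdots \sum_{l_k=1}^{n} T^{l_1 \dots l_k} \prod_{\mu=1}^{k} U^{p_\mu l_\mu} \Big) \prod_{\nu=1}^{k} V^{i_\nu p_\nu} \Big).
\end{aligned} \]

On the other hand we have

\[ ((VU)T(VU)^T)^{i_1 \dots i_k} = \sum_{l_1=1}^{n} \cdots \sum_{l_k=1}^{n} T^{l_1 \dots l_k} \prod_{\mu=1}^{k} (VU)^{i_\mu l_\mu}. \]

Using that

\[ (VU)^{i_\mu l_\mu} = \sum_{p_\mu=1}^{n} V^{i_\mu p_\mu} U^{p_\mu l_\mu}, \]

we get

\[ \prod_{\mu=1}^{k} (VU)^{i_\mu l_\mu} = \prod_{\mu=1}^{k} \sum_{p_\mu=1}^{n} V^{i_\mu p_\mu} U^{p_\mu l_\mu} = \sum_{p_1=1}^{n} \cdots \sum_{p_k=1}^{n} \prod_{\mu=1}^{k} V^{i_\mu p_\mu} U^{p_\mu l_\mu}. \]

Putting everything together and observing that we can exchange the multiple
sum $\sum_{p_1=1}^{n} \cdots \sum_{p_k=1}^{n}$ with the multiple sum $\sum_{l_1=1}^{n} \cdots \sum_{l_k=1}^{n}$ we finish the
proof of the lemma. $\blacksquare$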

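Let us see now that conjugation with an orthogonal matrix is an orthogonal
transformation on $T^{k,n}$. That is, it doesn't change the norm of the tensor.
In other words, if we define

\[ \|T\| := \sqrt{\langle T, T \rangle}, \]

then we have the following lemma.

Lemma 4.3 Let $T$ be a $k$-tensor on $\mathbb{R}^n$, and $U$ be any orthogonal matrix in
$O^n$, then

\[ \|UTU^T\| = \|T\|. \]

Proof. Direct calculation of the quantity $\|UTU^T\|^2$ gives:

\[ \begin{aligned}
\|UTU^T\|^2 &= \langle UTU^T, UTU^T \rangle \\
&= \sum_{i_1,\dots,i_k=1}^{n,\dots,n} (UTU^T)^{i_1 \dots i_k} (UTU^T)^{i_1 \dots i_k} \\
&= \sum_{i_1,\dots,i_k=1}^{n,\dots,n} \ \sum_{p_1,\dots,p_k=1}^{n,\dots,n} \ \sum_{q_1,\dots,q_k=1}^{n,\dots,n} T^{p_1 \dots p_k} T^{q_1 \dots q_k} U^{i_1p_1}U^{i_1q_1} \cdots U^{i_kp_k}U^{i_kq_k} \\
&= \sum_{p_1,\dots,p_k=1}^{n,\dots,n} \ \sum_{q_1,\dots,q_k=1}^{n,\dots,n} T^{p_1 \dots p_k} T^{q_1 \dots q_k} \Big( \sum_{i_1=1}^{n} U^{i_1p_1}U^{i_1q_1} \Big) \cdots \Big( \sum_{i_k=1}^{n} U^{i_kp_k}U^{i_kq_k} \Big) \\
&= \sum_{p_1,\dots,p_k=1}^{n,\dots,n} \ \sum_{q_1,\dots,q_k=1}^{n,\dots,n} T^{p_1 \dots p_k} T^{q_1 \dots q_k} \delta_{p_1q_1} \cdots \delta_{p_kq_k} \\
&= \sum_{p_1,\dots,p_k=1}^{n,\dots,n} T^{p_1 \dots p_k} T^{p_1 \dots p_k} \\
&= \|T\|^2. \qquad \blacksquare
\end{aligned} \]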
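Lemmas 4.2 and 4.3 lend themselves to an exact numerical check. The sketch below is an illustration, not the paper's method: it assumes tensors stored as dictionaries, uses Householder reflections with rational entries as convenient exact orthogonal matrices, and the names `householder`, `matmul`, `act` and `norm_squared` are hypothetical helpers. The function `act` implements Formula (7).

```python
from fractions import Fraction
from itertools import product


def householder(v):
    """Exact orthogonal matrix I - 2 v v^T / (v^T v) with rational entries."""
    n, vv = len(v), sum(x * x for x in v)
    return [[Fraction(int(i == j)) - Fraction(2 * v[i] * v[j], vv)
             for j in range(n)] for i in range(n)]


def matmul(A, B):
    n = len(A)
    return [[sum(A[i][m] * B[m][j] for m in range(n)) for j in range(n)]
            for i in range(n)]


def act(U, T, k):
    """Formula (7): (U T U^T)^{i_1...i_k} = sum_p T^{p_1...p_k} U^{i_1 p_1} ... U^{i_k p_k}."""
    n = len(U)
    out = {}
    for i in product(range(n), repeat=k):
        total = 0
        for p in product(range(n), repeat=k):
            weight = T[p]
            for a in range(k):
                weight *= U[i[a]][p[a]]
            total += weight
        out[i] = total
    return out


def norm_squared(T):
    return sum(value * value for value in T.values())


n, k = 3, 3
# An arbitrary 3-tensor with integer entries, chosen only for the demonstration.
T = {i: Fraction(1 + i[0] - 2 * i[1] * i[2] + 3 * i[2])
     for i in product(range(n), repeat=k)}
U = householder([1, 2, 2])
V = householder([2, -1, 3])

associative = act(V, act(U, T, k), k) == act(matmul(V, U), T, k)   # Lemma 4.2
isometric = norm_squared(act(U, T, k)) == norm_squared(T)          # Lemma 4.3
```

Because all arithmetic is carried out with `Fraction`, both comparisons are exact rather than up to floating-point tolerance.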

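After all these preparations, we can give the following generalization of
Equation (6). (When $k = 1$ and $\sigma = (1)$ we obtain exactly Equation (6).)

Theorem 4.4 For any $k$-tensor $T$, any matrices $H_1, \dots, H_k$, any orthogonal
matrix $U$, and any permutation $\sigma$ on $\mathbb{N}_k$ we have the identity

\[ (9) \qquad \langle T, \tilde{H}_1 \circ_\sigma \cdots \circ_\sigma \tilde{H}_k \rangle = (U(\mathrm{Diag}^\sigma T)U^T)[H_1, \dots, H_k], \]

where $\tilde{H}_i = U^T H_i U$, for all $i = 1, 2, \dots, k$.

Proof. Since both sides are linear in each argument separately, it is enough
to show that the equality holds for $k$-tuples $(H_{i_1j_1}, \dots, H_{i_kj_k})$ of basic matrices.
Using Lemma 2.2 and the fact that $\tilde{H}_{ij}^{pq} = U^{ip}U^{jq}$, we develop the
left-hand side of Equation (9):

\[ \begin{aligned}
\langle T, \tilde{H}_{i_1j_1} \circ_\sigma \cdots \circ_\sigma \tilde{H}_{i_kj_k} \rangle
&= \sum_{p_1,\dots,p_k=1}^{n,\dots,n} T^{p_1 \dots p_k} \tilde{H}_{i_1j_1}^{p_1 p_{\sigma^{-1}(1)}} \cdots \tilde{H}_{i_kj_k}^{p_k p_{\sigma^{-1}(k)}} \\
&= \sum_{p_1,\dots,p_k=1}^{n,\dots,n} T^{p_1 \dots p_k} U^{i_1p_1} U^{j_1 p_{\sigma^{-1}(1)}} \cdots U^{i_kp_k} U^{j_k p_{\sigma^{-1}(k)}}.
\end{aligned} \]

On the other hand, using the definitions we calculate that the right-hand
side is:

\[ \begin{aligned}
(U(\mathrm{Diag}^\sigma T)U^T)[H_{i_1j_1}, \dots, H_{i_kj_k}]
&= (U(\mathrm{Diag}^\sigma T)U^T)^{i_1 \dots i_k}_{j_1 \dots j_k} \\
&= \sum_{p_1,q_1=1}^{n,n} \cdots \sum_{p_k,q_k=1}^{n,n} \Big( (\mathrm{Diag}^\sigma T)^{p_1 \dots p_k}_{q_1 \dots q_k} \prod_{\nu=1}^{k} U^{i_\nu p_\nu} U^{j_\nu q_\nu} \Big) \\
&= \sum_{p_1=1}^{n} \cdots \sum_{p_k=1}^{n} T^{p_1 \dots p_k} \prod_{\nu=1}^{k} U^{i_\nu p_\nu} U^{j_\nu p_{\sigma^{-1}(\nu)}}.
\end{aligned} \]

This shows that both sides are equal. $\blacksquare$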
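Identity (9) can also be verified exactly on a small instance. The following sketch rests on assumptions made for illustration: a $2k$-tensor is stored as a dictionary keyed by pairs of index tuples, it is evaluated on matrices by $D[H_1, \dots, H_k] = \sum D^{i_1 \dots i_k}_{j_1 \dots j_k} H_1^{i_1j_1} \cdots H_k^{i_kj_k}$ (the convention implied by the proof above), and all function names are hypothetical. The orthogonal matrix is a $2 \times 2$ reflection with rational entries.

```python
from fractions import Fraction
from itertools import permutations, product


def inverse(sigma):
    inv = [0] * len(sigma)
    for s, image in enumerate(sigma):
        inv[image] = s
    return inv


def pairing(T, mats, sigma):
    """<T, H_1 o_sigma ... o_sigma H_k>, with the product taken from Lemma 2.2."""
    k, n, inv = len(mats), len(mats[0]), inverse(sigma)
    total = 0
    for i in product(range(n), repeat=k):
        term = T[i]
        for s in range(k):
            term *= mats[s][i[s]][i[inv[s]]]
        total += term
    return total


def diag_sigma(T, sigma, n):
    """2k-tensor Diag^sigma T: entry T^{i} when i_s = j_{sigma(s)} for all s, else 0."""
    k = len(sigma)
    D = {}
    for i in product(range(n), repeat=k):
        for j in product(range(n), repeat=k):
            match = all(i[s] == j[sigma[s]] for s in range(k))
            D[i, j] = T[i] if match else 0
    return D


def conjugate(U, D, k):
    """Formula (8): the action U D U^T on a 2k-tensor D."""
    n = len(U)
    out = {}
    for i in product(range(n), repeat=k):
        for j in product(range(n), repeat=k):
            total = 0
            for (p, q), value in D.items():
                if value:
                    weight = value
                    for v in range(k):
                        weight *= U[i[v]][p[v]] * U[j[v]][q[v]]
                    total += weight
            out[i, j] = total
    return out


def evaluate(D, mats):
    """D[H_1, ..., H_k] under the evaluation convention stated in the lead-in."""
    total = 0
    for (i, j), value in D.items():
        term = value
        for s, H in enumerate(mats):
            term *= H[i[s]][j[s]]
        total += term
    return total


def matmul(A, B):
    n = len(A)
    return [[sum(A[i][m] * B[m][j] for m in range(n)) for j in range(n)]
            for i in range(n)]


def transpose(A):
    return [list(row) for row in zip(*A)]


n, k = 2, 3
U = [[Fraction(3, 5), Fraction(-4, 5)], [Fraction(-4, 5), Fraction(-3, 5)]]
T = {i: 1 + i[0] + 2 * i[1] + 4 * i[2] for i in product(range(n), repeat=k)}
Hs = [[[1, 2], [3, 4]], [[0, -1], [5, 2]], [[2, 7], [-3, 1]]]
H_tilde = [matmul(matmul(transpose(U), H), U) for H in Hs]

theorem_holds = all(
    pairing(T, H_tilde, sigma)
    == evaluate(conjugate(U, diag_sigma(T, sigma, n), k), Hs)
    for sigma in permutations(range(k))
)
```

Replacing `U` by the identity matrix in the same comparison checks the special case in which no conjugation is applied.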

If we take the orthogonal matrix $U$ to be the identity matrix we obtain
the following corollary.

Corollary 4.5 For any $k$-tensor $T$, any matrices $H_1, \dots, H_k$, and any
permutation $\sigma$ on $\mathbb{N}_k$, we have the identity

\[ (10) \qquad \langle T, H_1 \circ_\sigma \cdots \circ_\sigma H_k \rangle = (\mathrm{Diag}^\sigma T)[H_1, \dots, H_k]. \]

If in Corollary 4.5 we substitute the matrices $H_1, \dots, H_k$ with $\tilde{H}_1, \dots, \tilde{H}_k$ and
we use Theorem 4.4 we obtain the next result.

Corollary 4.6 For any $k$-tensor $T$, orthogonal matrix $U \in O(n)$, permutation
$\sigma$ on $\mathbb{N}_k$, and any matrices $H_1, \dots, H_k$ we have the identity

\[ (11) \qquad (\mathrm{Diag}^\sigma T)[\tilde{H}_1, \dots, \tilde{H}_k] = (U(\mathrm{Diag}^\sigma T)U^T)[H_1, \dots, H_k]. \]

If in Corollary 4.5 we take $\sigma$ to be the identity permutation, then we get
the next corollary, which generalizes Equation (5).

Corollary 4.7 For any $k$-tensor $T$ and any matrices $H_1, \dots, H_k$ we have the
identity

\[ (12) \qquad T[\mathrm{diag}\,H_1, \dots, \mathrm{diag}\,H_k] = (\mathrm{Diag}^{(1)(2)\cdots(k)} T)[H_1, \dots, H_k]. \]

We conclude this section with a second look at the first two derivatives
of spectral functions.

Example 4.8 As we saw in Example 3.4, the first derivative of the spectral
function $f \circ \lambda$ at the point $X = V(\mathrm{Diag}\,\lambda(X))V^T$, applied to the symmetric
matrix $E$, is given by the formula

\[ \nabla(f \circ \lambda)(X)[E] = \langle \nabla f(\lambda(X)), \circ_{(1)} \tilde{E} \rangle, \]

where $\tilde{E} = V^T E V$. This formula can be rewritten as

\[ \nabla(f \circ \lambda)(X)[E] = \langle V(\mathrm{Diag}\,\nabla f(\lambda(X)))V^T, E \rangle = V(\mathrm{Diag}^{(1)}\nabla f(\lambda(X)))V^T[E]. \]

This was essentially the original form of this formula given in Equation (1).

The usefulness of the new notation becomes more evident below.

Example 4.9 Let $X$ be a symmetric matrix with ordered spectral decomposition
$X = V(\mathrm{Diag}\,\lambda(X))V^T$. Take two symmetric matrices $E_1$ and $E_2$
and let $\tilde{E}_i = V^T E_i V$ for $i = 1, 2$. As we saw in Example 3.5, the Hessian
of the spectral function $f \circ \lambda$ at the point $X = V(\mathrm{Diag}\,\lambda(X))V^T$, applied to
the symmetric matrices $E_1$ and $E_2$, is given by the formula

\[ \nabla^2(f \circ \lambda)(X)[E_1, E_2] = \langle \nabla^2 f(\lambda(X)), \tilde{E}_1 \circ_{(1)(2)} \tilde{E}_2 \rangle + \langle \mathcal{A}(\lambda(X)), \tilde{E}_1 \circ_{(12)} \tilde{E}_2 \rangle. \]

With the notation introduced in this section we can rewrite it as

\[ \nabla^2(f \circ \lambda)(X)[E_1, E_2] = \big(V(\mathrm{Diag}^{(1)(2)}\nabla^2 f(\lambda(X)))V^T\big)[E_1, E_2] + \big(V(\mathrm{Diag}^{(12)}\mathcal{A}(\lambda(X)))V^T\big)[E_1, E_2]. \]

Or, in other words,

\[ \nabla^2(f \circ \lambda)(X) = V\big(\mathrm{Diag}^{(1)(2)}\nabla^2 f(\lambda(X)) + \mathrm{Diag}^{(12)}\mathcal{A}(\lambda(X))\big)V^T. \]

Finally, we express Conjecture 3.1 in the new language.
Conjecture 4.1 The spectral function $f \circ \lambda$ is $k$ times (continuously) differentiable
at $X$ if, and only if, $f(x)$ is $k$ times (continuously) differentiable at
the vector $\lambda(X)$. Moreover, there are $k$-tensor valued maps $\mathcal{A}_\sigma : \mathbb{R}^n \to T^{k,n}$,
$\sigma \in P^k$, such that

$$\nabla^k(f \circ \lambda)(X) = V\Bigl(\sum_{\sigma \in P^k} \operatorname{Diag}^{\sigma} \mathcal{A}_\sigma(\lambda(X))\Bigr)V^T, \tag{13}$$

where $X = V(\operatorname{Diag}\lambda(X))V^T$.

In ^3] we will show that this conjecture holds for the derivatives of any 
spectral function at a symmetric matrix X with distinct eigenvalues, as well 
as for the derivatives of separable spectral functions at an arbitrary symmetric
matrix. (Separable spectral functions are those arising from symmetric
functions $f(x) = g(x_1) + \cdots + g(x_n)$ for some function $g$ of a scalar
argument.) There we also describe how, for every $\sigma$ in $P^k$, to compute the
operators $\mathcal{A}_\sigma$, which depend only on the symmetric function $f$.

5 Comments on Conjecture 4.1

In this section we show that once Conjecture 4.1 is established for $k = 1$,
then for $k \ge 2$ it is enough to prove it only in the case when $X = \operatorname{Diag} x$
for some $x \in \mathbb{R}^n$ with $x_1 \ge \cdots \ge x_n$. We begin with a simple lemma. For
brevity, given a $k$-tensor $T$ on $\mathbb{R}^n$, by $T[H]$ we denote the $(k-1)$-tensor
$T[\cdot, \dots, \cdot, H]$.

Lemma 5.1 Let $T$ be any $2k$-tensor on $\mathbb{R}^n$, $U \in O^n$, and let $H$ be any
matrix. Then the following identity holds:

$$U(T[\tilde{H}])U^T = (U T U^T)[H],$$

where $\tilde{H} = U^T H U$.

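Proof. Since both sides are linear with respect to $H$, it is enough to prove
the identity only for basic matrices $H_{i_k j_k}$. By the definition of conjugation,
and using the fact that $\tilde{H}_{i_k j_k}^{pq} = U^{i_k p} U^{j_k q}$, we obtain

$$\begin{aligned}
\bigl(U(T[\tilde{H}_{i_k j_k}])U^T\bigr)^{i_1 \dots i_{k-1}}_{j_1 \dots j_{k-1}}
&= \sum_{\substack{p_s, q_s = 1 \\ s = 1, \dots, k-1}}^{n} \bigl(T[\tilde{H}_{i_k j_k}]\bigr)^{p_1 \dots p_{k-1}}_{q_1 \dots q_{k-1}} U^{i_1 p_1} U^{j_1 q_1} \cdots U^{i_{k-1} p_{k-1}} U^{j_{k-1} q_{k-1}} \\
&= \sum_{\substack{p_s, q_s = 1 \\ s = 1, \dots, k}}^{n} T^{p_1 \dots p_k}_{q_1 \dots q_k} U^{i_1 p_1} U^{j_1 q_1} \cdots U^{i_k p_k} U^{j_k q_k} \\
&= (U T U^T)^{i_1 \dots i_k}_{j_1 \dots j_k} \\
&= \bigl((U T U^T)[H_{i_k j_k}]\bigr)^{i_1 \dots i_{k-1}}_{j_1 \dots j_{k-1}}.
\end{aligned}$$

■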
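The identity of Lemma 5.1 can be checked numerically for a tensor with two index pairs. The sketch below is a minimal NumPy illustration under stated assumptions: the tensor is stored with axis order $(i_1, j_1, i_2, j_2)$, conjugation multiplies every index by $U$, and $T[H]$ contracts the last index pair with the entries of $H$. The helper names `conjugate` and `apply_last` are hypothetical.

```python
import numpy as np

def conjugate(T, U):
    """Conjugation U T U^T for a tensor with two index pairs.

    Entry (i1, j1, i2, j2) is the sum over p1, q1, p2, q2 of
    T[p1, q1, p2, q2] * U[i1, p1] * U[j1, q1] * U[i2, p2] * U[j2, q2].
    The axis order (i1, j1, i2, j2) is a layout chosen for this sketch.
    """
    return np.einsum("abcd,ia,jb,kc,ld->ijkl", T, U, U, U, U)

def apply_last(T, H):
    """T[H]: contract the last index pair (i2, j2) of T with the entries of H."""
    return np.einsum("abcd,cd->ab", T, H)

rng = np.random.default_rng(1)
n = 3
T = rng.standard_normal((n, n, n, n))
H = rng.standard_normal((n, n))
# The Q factor of a QR factorization is an orthogonal matrix.
U, _ = np.linalg.qr(rng.standard_normal((n, n)))

H_tilde = U.T @ H @ U
lhs = U @ apply_last(T, H_tilde) @ U.T      # U (T[H~]) U^T
rhs = apply_last(conjugate(T, U), H)        # (U T U^T)[H]
max_gap = float(np.max(np.abs(lhs - rhs)))
```

With these conventions `max_gap` should be at the level of floating-point round-off.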



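Suppose that Conjecture 4.1 holds for all derivatives of order less than $k$
and for the $k$-th derivative it holds only for ordered diagonal matrices. We
will show that the conjecture holds for the $k$-th derivative at an arbitrary
matrix. Indeed, let $X = V(\operatorname{Diag}\lambda(X))V^T$, let $E$ be an arbitrary symmetric
matrix and denote $\tilde{E} = V^T E V$. Then

$$\begin{aligned}
\nabla^{k-1}F(X + E) &= \nabla^{k-1}F\bigl(V(\operatorname{Diag}\lambda(X) + \tilde{E})V^T\bigr) \\
&= V\bigl(\nabla^{k-1}F(\operatorname{Diag}\lambda(X) + \tilde{E})\bigr)V^T \\
&= V\bigl(\nabla^{k-1}F(\operatorname{Diag}\lambda(X))\bigr)V^T + V\bigl(\nabla^{k}F(\operatorname{Diag}\lambda(X))[\tilde{E}]\bigr)V^T + o(\|E\|) \\
&= \nabla^{k-1}F(X) + \bigl(V(\nabla^{k}F(\operatorname{Diag}\lambda(X)))V^T\bigr)[E] + o(\|E\|).
\end{aligned}$$

The last equality uses Lemma 5.1, and the error term is unchanged because $\|\tilde{E}\| = \|E\|$.
This shows that $\nabla^{k-1}F$ is differentiable at $X$ and that $V(\nabla^{k}F(\operatorname{Diag}\lambda(X)))V^T$
is the $k$-th derivative of $F$ at $X$.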

Proposition 5.2 Suppose the $k$-th derivative of the spectral function $F = f \circ \lambda$
is given by Equation (13) for all $X$. If for every $\sigma \in P^k$ the tensor valued
map $x \in \mathbb{R}^n \to \mathcal{A}_\sigma(x) \in T^{k,n}$ is continuous, then $\nabla^k F(X)$ is continuous in
$X$; in other words, $F \in C^k$.

Proof. Suppose, to the contrary, that there is a sequence of symmetric matrices $X_m$
approaching $X$ and an $\epsilon > 0$ such that

$$\|\nabla^k F(X_m) - \nabla^k F(X)\| > \epsilon, \quad \text{for all } m.$$

Let $X_m = V_m(\operatorname{Diag}\lambda(X_m))V_m^T$ and suppose without loss of generality that
the orthogonal matrices $V_m$ approach $V$. (Otherwise, take a subsequence.) Clearly,
we have $X = V(\operatorname{Diag}\lambda(X))V^T$, and by continuity of eigenvalues $\lambda(X_m)$
approaches $\lambda(X)$. Using the formula for the $k$-th derivative and the continuity
of the tensorial maps, the contradiction follows. ■
6 Equivalence relations on $\mathbb{N}_n$

Suppose that $\sim$ is an equivalence relation on the integers $\mathbb{N}_n$ and denote by
$I_1, I_2, \dots, I_r$ the equivalence classes determined by $\sim$. The equivalence classes
will also be called blocks. One may assume that the blocks are numbered so
that $I_1$ contains the integer 1, $I_2$ contains the smallest integer not in $I_1$, $I_3$
contains the smallest integer not in $I_1 \cup I_2$, and so on.
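The numbering convention for the blocks can be made concrete with a short sketch. It is only an illustration: the function name `numbered_blocks`, the predicate argument `related`, and the sample relation "$i \sim j$ when two entries of a vector coincide" are assumptions introduced here.

```python
def numbered_blocks(n, related):
    """Return the equivalence classes of {1, ..., n}, ordered by smallest element.

    `related(i, j)` is a hypothetical predicate, assumed to be an
    equivalence relation on the integers 1, ..., n.
    """
    blocks = []
    for i in range(1, n + 1):
        for block in blocks:
            # Comparing with one representative suffices for an equivalence relation.
            if related(block[0], i):
                block.append(i)
                break
        else:
            # i is the smallest integer not covered so far, so it opens the next block.
            blocks.append([i])
    return blocks

# Illustrative relation: i ~ j when the i-th and j-th entries of x are equal.
x = [5.0, 5.0, 3.0, 1.0, 1.0, 1.0]
blocks = numbered_blocks(len(x), lambda i, j: x[i - 1] == x[j - 1])

# A relation whose blocks are not contiguous.
labels = ["a", "b", "a", "c", "b"]
mixed = numbered_blocks(len(labels), lambda i, j: labels[i - 1] == labels[j - 1])
```

Because the integers are scanned in increasing order, the first block always contains 1 and each later block starts at the smallest integer not yet assigned.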

In this short section, we will be interested in tensors having the following 
structure. 

Definition 6.1 We say that a tensor $T \in T^{k,n}$ is block-constant (with respect
to the equivalence relation $\sim$) if

$$T^{i_1 \dots i_k} = T^{j_1 \dots j_k} \quad \text{whenever } i_s \sim j_s \text{ for all } s = 1, 2, \dots, k.$$
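A direct way to test whether a given array is block-constant in the sense of Definition 6.1 is to group its entries by the blocks of their indices. The sketch below assumes that the equivalence relation is encoded by a list `labels` with `labels[i]` naming the block of index `i`, using 0-based indices rather than the 1-based indices of the text; the function name is hypothetical.

```python
import numpy as np

def is_block_constant(T, labels):
    """Check that T[i_1, ..., i_k] == T[j_1, ..., j_k] whenever
    labels[i_s] == labels[j_s] for every position s (0-based indices)."""
    seen = {}
    for index in np.ndindex(*T.shape):
        key = tuple(labels[i] for i in index)   # blocks of the k indices
        value = T[index]
        if key in seen and seen[key] != value:
            return False
        seen.setdefault(key, value)
    return True

labels = [0, 0, 1]                      # blocks {1, 2} and {3} in 1-based numbering
B = np.array([[1.0, 2.0], [3.0, 4.0]])  # one value per pair of blocks
T = B[np.ix_(labels, labels)]           # T[i, j] = B[labels[i], labels[j]]

T_bad = T.copy()
T_bad[0, 1] += 1.0                      # break the block structure in one entry
```

Here `T` is block-constant by construction, while `T_bad` is not. The comparison is exact, so for floating-point data produced by computation a tolerance would be needed.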

Let $\mu$ be an arbitrary but fixed permutation in $P^k$. We introduce the
linear operator $P_\mu^{\sim}$ on the space $T^{k,n}$, generalizing the operator $P_\mu$ defined
earlier. The definition is element-wise, as follows:

$$\bigl(P_\mu^{\sim}(T)\bigr)^{i_1 \dots i_k} = \begin{cases} T^{i_1 \dots i_k}, & \text{if } i_s \sim i_{\mu(s)} \ \forall s \in \mathbb{N}_k, \\ 0, & \text{otherwise.} \end{cases}$$

Clearly, when the equivalence relation $\sim$ is such that $i \sim j$ if, and only
if, $i = j$, then $P_\mu^{\sim}$ becomes equal to the previously defined $P_\mu$. We would like
to conclude this work with a generalization of Theorem 3.1.
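The element-wise definition of the operator above can be sketched in a few lines of NumPy. The sketch reads the rule as "keep the entry $T^{i_1 \dots i_k}$ when $i_s \sim i_{\mu(s)}$ for every $s$, and replace it by zero otherwise", which is an assumption about the intended definition. The function name `block_projection`, the 0-based encoding of $\mu$ as a tuple, and the `labels` encoding of the equivalence relation are all introduced here for illustration.

```python
import numpy as np

def block_projection(T, mu, labels):
    """Keep T[i_1, ..., i_k] when i_s ~ i_{mu(s)} for every s; zero it otherwise.

    `mu` is a permutation of range(k) given as a tuple (0-based), and
    `labels[i]` names the block of index i.
    """
    out = np.zeros_like(T)
    for index in np.ndindex(*T.shape):
        if all(labels[index[s]] == labels[index[mu[s]]] for s in range(T.ndim)):
            out[index] = T[index]
    return out

T = np.arange(1.0, 10.0).reshape(3, 3)
swap = (1, 0)              # the transposition (12), written 0-based
identity = (0, 1)          # the identity permutation
equality = [0, 1, 2]       # i ~ j if and only if i = j
coarse = [0, 0, 1]         # blocks {1, 2} and {3} in 1-based numbering

P_eq = block_projection(T, swap, equality)
P_coarse = block_projection(T, swap, coarse)
P_id = block_projection(T, identity, coarse)
```

With the equality relation and the transposition only the diagonal entries survive, while the coarser relation also keeps the off-diagonal entries inside the first block. The identity permutation leaves the tensor unchanged, and applying the operator twice gives the same result as applying it once.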

Theorem 6.2 Let $\sigma_1$, $\sigma_2$, and $\mu$ be three permutations in $P^k$. Then for
any block-constant matrices $H_1, \dots, H_k$, and any tensor $T$ in $T^{k,n}$ we have the
identity

$$\langle P_\mu^{\sim}(T), H_1 \circ_{\sigma_1} \cdots \circ_{\sigma_1} H_k \rangle = \langle P_\mu^{\sim}(T), H_1 \circ_{\sigma_2} \cdots \circ_{\sigma_2} H_k \rangle$$

if, and only if, $\mu \preceq \sigma_2^{-1} \circ \sigma_1$.

Proof. The proof is completely analogous to that of Theorem 3.1.
Consider a basis for the space of block-constant matrices $\{H_{pq} : 1 \le p, q \le n\}$
such that $H_{pq}^{ij}$ is equal to one if $i \sim p$ and $j \sim q$, and zero otherwise. Then all we have
to change in the proof of Theorem 3.1 is to replace the "$=$" signs between indexes with
"$\sim$" signs and all "$\ne$" signs with "$\nsim$". ■